Abstract
Background: B-Cell Maturation Antigen (BCMA)-directed CAR-T therapy has transformed the treatment paradigm for patients with relapsed/refractory multiple myeloma (RRMM), yet 30-50% of patients progress within 12 months, and PET-positive extramedullary disease emerges in one-third, limiting durable disease control. In parallel, an expanding array of bispecific antibodies, next-generation CAR constructs, and clinical trials offers actionable alternatives for patients predicted to fail standard BCMA-CAR-T. Contemporary prognostic scores derived from limited clinical variables offer modest discrimination and rarely inform risk-adapted care. We therefore investigated whether an explainable multimodal artificial-intelligence (MAI) framework that integrates clinical, serologic, cytogenetic, and quantitative imaging data could sharpen early risk prediction after BCMA-CAR-T.
Methods: Twenty-seven baseline variables were captured, including pre-lymphodepletion (pre-LD) serum soluble BCMA (sBCMA; R&D Systems, Minneapolis, MN; catalog no. DY193), ferritin, C-reactive protein (CRP), β2-microglobulin, absolute lymphocyte count (ALC), ISS stage, high-risk plasma cell cytogenetics by fluorescence in situ hybridization (del17p, t(4;14), t(14;16), chromosome 1 abnormalities), and metabolic tumor volume (MTV) extracted from pre-LD ¹⁸F-FDG PET/CT scans as previously described (Freeman, Blood, 2024). Patients with complete data formed the modelling cohort. Explainable machine learning models based on Elastic Net, Random Survival Forest (RSF), and Gradient-Boosting Survival Machine (GBSM) algorithms were trained with 5-fold cross-validation repeated over multiple randomized initializations to mitigate overfitting. Harrell's concordance index (C-index) quantified prognostic accuracy for progression-free survival (PFS) and overall survival (OS). Performance was benchmarked against existing risk models: MyCARe (Gagelmann, JCO, 2024), Stratification of CAR-T Outcomes at Pre-Apheresis Evaluation (SCOPE), and established tumor burden measurements (TMB) based on sBCMA and PET-derived pre-treatment MTV (Freeman, Blood, 2024). Predictor importance was interrogated with permutation analysis and SHAP values. The Nelson-Aalen estimator was used to estimate cumulative risk, which was compared across MAI-derived risk strata.
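As an illustration of the accuracy metric named above, the following is a minimal sketch of Harrell's C-index in plain Python. It is not the study's implementation: the function name `harrell_c_index` and the handling of tied observation times (skipped here for simplicity) are assumptions made for this example. A pair of patients is comparable when the one with the shorter observed time had an event; the pair is concordant when that patient also received the higher predicted risk.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable patient pairs ranked correctly by predicted risk.

    times       -- observed follow-up time per patient
    events      -- 1/True if the event (e.g. progression) occurred, 0/False if censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification for this sketch: tied times are skipped
        # 'a' is the patient with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the true ordering is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the model and benchmark scores are reported.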
Results: We retrospectively analyzed 183 consecutive RRMM patients infused with idecabtagene vicleucel or ciltacabtagene autoleucel between May 5, 2021, and June 5, 2024. Median follow-up among surviving patients was 22.1 months (range 2.8-44.1). Baseline patient demographics have been previously published and were consistent with real-world expectations (Freeman, Blood, 2024). The Elastic Net achieved C-indices of 0.625 ± 0.125 and 0.635 ± 0.170 for PFS and OS, respectively, while GBSM yielded C-indices of 0.690 ± 0.089 and 0.641 ± 0.179, respectively. A fine-tuned RSF slightly outperformed the other MAI models, delivering C-indices of 0.701 ± 0.073 (PFS) and 0.674 ± 0.192 (OS). Collectively, these MAI models outperformed the existing conventional scores (PFS/OS): MyCARe (0.611/0.627), SCOPE (0.612/0.633), and tumor burden (TMB, 0.629/0.467). For instance, the RSF MAI model stratified patients into low-, intermediate-, and high-risk groups with 12-month progression risks of 12.8%, 47.9%, and 85.0%, corresponding to PFS rates of 87.2%, 52.1%, and 15.0%, respectively (log-rank p < 0.001).
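For readers unfamiliar with the Nelson-Aalen estimator used for the risk strata, a minimal sketch follows. The cumulative hazard is the running sum of d_i / n_i over distinct event times, where d_i is the number of events at that time and n_i the number of patients still at risk. The helper `risk_at`, which converts cumulative hazard H(t) into a cumulative risk via 1 - exp(-H(t)), shows one common conversion and is an assumption of this example; the abstract does not state how the 12-month risks were derived.

```python
import math
from collections import Counter


def nelson_aalen(times, events):
    """Return [(t, H(t))] at each distinct event time, H being the cumulative hazard."""
    event_counts = Counter(t for t, e in zip(times, events) if e)
    cumulative = 0.0
    curve = []
    for t in sorted(event_counts):
        at_risk = sum(1 for u in times if u >= t)  # patients still under observation at t
        cumulative += event_counts[t] / at_risk
        curve.append((t, cumulative))
    return curve


def risk_at(curve, horizon):
    """Cumulative risk by `horizon`, assuming risk = 1 - exp(-H(horizon))."""
    hazard = 0.0
    for t, value in curve:
        if t <= horizon:
            hazard = value  # step function: keep the last value at or before horizon
    return 1.0 - math.exp(-hazard)
```

Applied separately to each risk stratum with time measured in months, `risk_at(curve, 12)` would give a 12-month risk estimate for that stratum under the stated assumption.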
The most influential features for PFS in the RSF MAI model were pre-LD sBCMA, pre-LD ferritin, and ALC at apheresis, with MTV also contributing to overall model performance. The dominant features for OS were pre-LD sBCMA, pre-LD albumin, and LDH. GBSM identified overlapping features, with the addition of β2-microglobulin for PFS and CRP for OS.
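Feature rankings of this kind can be obtained by permutation analysis, sketched below in generic form. A feature's importance is taken as the average drop in a performance score when that feature's column is shuffled across patients, which breaks its link to the outcome. The names `permutation_importance`, `predict`, and `score` are hypothetical placeholders for this example; in a setting like the one described, `predict` would be a fitted survival model and `score` a C-index, and SHAP values are a separate technique not shown here.

```python
import random


def permutation_importance(predict, score, rows, targets, n_repeats=5, seed=0):
    """Mean score drop per feature after shuffling that feature's column.

    predict -- callable mapping one feature row (list) to a prediction
    score   -- callable (targets, predictions) -> float, higher is better
    rows    -- list of feature rows, each a list of equal length
    """
    rng = random.Random(seed)
    baseline = score(targets, [predict(r) for r in rows])
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[col] for r in rows]
            rng.shuffle(column)  # break the feature-outcome association
            permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - score(targets, [predict(r) for r in permuted]))
        importances.append(sum(drops) / n_repeats)
    return importances
```

Features the model never uses receive an importance of exactly zero, while larger positive values indicate features the model relies on more heavily.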
Conclusion: This explainable multimodal-AI platform outperforms available clinically derived prognostic scores by unifying tumor-burden, inflammatory, biomarker, and serological signals. Ongoing expansion to incorporate whole-genome sequencing, digital pathology, and longitudinally collected data is expected to yield an even more powerful, continuously learning risk engine that can guide patient management and adaptive trial design, inform pre-emptive intervention strategies, and ultimately individualize care across a highly diverse myeloma population. These findings support further integration of multimodal AI for precision risk stratification in RRMM and warrant prospective validation in larger cohorts.